AI032
Programming Massively Parallel Processors: A Hands-on Approach
Memory Optimization and Shared Memory Tiling
Learning Objectives
- Understand the GPU memory hierarchy and the latency characteristics of each level
- Identify patterns for global memory coalescing
- Implement 1D and 2D tiling strategies using shared memory
- Analyze and mitigate shared memory bank conflicts
- Compare the performance of tiled and non-tiled implementations